Abstract

Schema  evolution  support  is  an  important  facility  for  object-oriented  database 
(OODB)  systems.  While  existing  OODB  systems  provide  for  limited  forms  of 
evolution,  including  modification  to  the  database  schema  and  reorganization  of 
affected  instances,  we  find  their  support  insufficient.  Specific  deficiencies  are  1) 
the  lack  of  compatibility  support  for  old  applications,  and  2)  the  lack  of  ability  to 
install  arbitrary  changes  upon  the  schema  and  database. 

This paper examines the limitations of existing schemes, and offers a more general
framework for specifying and reasoning about the evolution of class definitions
and the adaptation of existing, persistent instances to those new definitions.

Introduction


Database  systems  exist  to  support  the  long-term  persistence  of  data.  It  is  natural  to 
expect  that,  over  time,  needs  will  change  and  that  those  changes  will  necessitate  a 
modification to the interface for the persistent data. In an object-oriented database
(OODB) system, such a situation would motivate an evolution of the database schema.
For this reason, support for schema evolution is a required facility in any serious
OODB system.

An  OODB  database  schema  consists  of  class  definitions  and  an  inheritance  hierarchy. 
A  class  is  a  tuple  of  methods  and  attributes.  The  database  is  populated  by  instances  of 
those  classes,  with  values  for  each  of  the  attributes.  The  schema  describes  the  interface 
between  the  set  of  application  programs  and  the  persistent  repository  of  objects.  When 
the  schema  changes  so  does  the  interface,  possibly  leaving  incompatible  elements  on  both 
sides of the barrier. We are interested in the problem of managing existing database
objects, what we call the instance adaptation problem. This paper examines the
limitations of existing adaptation schemes and offers a more general framework for
specifying and reasoning about the evolution of class definitions and the adaptation of
existing, persistent instances to those new definitions.


Two  general  instance  adaptation  strategies  have  been  identified  and  implemented 
by  various  OODB  systems.  The  first  strategy,  conversion,  restructures  the  affected 
instances  to  conform  to  the  representation  of  their  modified  classes.  Conversion  is 
supported by the ORION[2, 13] and GemStone[5] systems.

The primary shortcoming of the conversion approach is its lack of support for program
compatibility. Because the former schema is discarded, application programs that
formerly interacted with the database through the changed parts of the interface
become obsolete. This is an especially significant problem when modification (or even
recompilation) of the application program is impossible (e.g., commercial software).

Rather than redefining the schema and converting the instances, the second strategy,
emulation,  is  based  on  a  class  versioning  scheme.  Each  class  evolution  defines  a 
new  version  of  the  class,  and  old  class  definitions  persist  indefinitely.  Instances  and 
applications  are  associated  with  a  particular  version  of  a  class,  and  the  runtime  system 
is  responsible  for  simulating  the  semantics  of  the  new  interface  on  top  of  instances 
of  the  old,  or  vice  versa.  Since  the  former  schema  is  not  discarded  but  retained  as 
an alternate interface, the emulation scheme provides program compatibility. Such a
facility has been developed for the Encore system [18].

Encore pays for this additional functionality with a loss in runtime efficiency. Under
a conversion scheme, the cost of the evolution is a function of the number of affected
[...] newly-created one. However, the cost of emulation is paid whenever there is a version
conflict between the application and a referenced instance.

We feel, however, that program compatibility among schema versions is a very desirable
feature. It can be of great utility in situations where the database acts as a common
repository of information shared by a variety of applications, as in Computer-Aided
Design or Office Automation systems.

Our scheme supports program compatibility by maintaining multiple versions of the
database schema. Old programs can continue to interact with the database (on both
new instances and old) using the former interface. Rather than emulating the evolved
semantics entirely at runtime, efficiency is gained by representing each object as an instance
of each version of its class. In this manner, our system effects a compromise between
the functionality of emulation and the efficiency of conversion.

Another failing common to the conversion-based evolution facilities is the limitations
placed on the variety of schema evolutions that can be performed. Most of the
existing systems restrict admissible evolutions to a predefined list of schema¹ change
operations (e.g., adding/deleting an attribute or method from a class, altering a class’s
inheritance list). The length of this list might vary from system to system, but they are
all similar in the way they support change: the changes that can be performed
are those which require either a fixed conversion of existing instances or no instance
conversion at all. Unfortunately, change is inherently unpredictable. A desired evolution
is sometimes revolutionary and, under such circumstances, these systems prevent
the database programmer from performing the desired changes.

We  are  interested  in  supporting  evolution  in  a  liberal  rather  than  a  conservative 
fashion; rather than the system offering a list of possible evolutions to the programmer,
the  programmer  should  be  able  to  specify  arbitrary  evolutions  and  rely  on  the  system 
for  assistance  and  verification.  Change  is  a  natural  occurrence  in  any  engineering  task, 
and  engineering-support  systems  should  help  rather  than  hinder  when  an  evolution  is 
required. 

Encore’s emulation facility restricts the breadth of class evolution that can be installed,
but the restrictions are of a different form. Since instances, once created, cannot
change their class-version, evolutions that require additional storage to be associated
with each instance cannot be defined (cf. the discussion of Encore under Related work).

¹Throughout this paper, schema refers to the complete collection of class definitions and class refers
to one particular type.

In the remainder of this paper, we present a model for specifying schema evolutions
and  instance  adaptation  strategies.  Our  system  supports  program  compatibility,  accepts 
a  larger  variety  of  evolutions  than  existing  systems,  and  supports  a  variety  of  options 
to  make  it  more  efficient  than  the  pure  emulation  facility  of  Encore. 


Related  work 


Before describing our system in detail, we present a more detailed description of important
existing systems.


ORION 

The  most  ambitious  and  effective  example  of  a  schema  evolution  support  facility  is 
that provided by ORION[2, 13, 14]. ORION provides a taxonomy of schema evolution
operations.  It  also  defines  a  database  model  in  the  form  of  invariants  that  must  be 
preserved  across  any  valid  evolution  operation  and  a  set  of  rules  that  instruct  the  system 
how best to maintain those invariants. Under this model, a schema designer specifies an
evolution in terms of the taxonomy; the system verifies the evolution by determining if
it is consistent with the invariants and then adjusts the schema and database according
to the appropriate rules.

ORION can only perform those evolutions for which it has a rule defined. The set
of rules is fixed. For example, changes to the domain of an attribute of a class are
restricted to generalizations of that domain. This restriction exists because there is
no facility in ORION’s evolution language for defining a procedure to be used by
the system to convert old instance values. Generalizations of the attribute domain are
allowed since this evolution does not require existing instances to be modified.

In  ORION,  evolutions  are  performed  on  a  unique  schema.  Instances  are  converted 
under  a  lazy  conversion  scheme;  that  is,  they  are  not  converted  when  the  evolution  is 
declared, but instead converted when they are referenced. Under this scheme, there is no
compatibility support for old programs and, depending on the evolution, information
contained in the instances might be lost at conversion time (e.g., deletion of an
attribute).


GemStone 

Like ORION, GemStone supports a set of evolution operations. It is distinguished by
its employment of an eager conversion scheme, converting affected instances when
the evolution is specified. This scheme has the advantage that no runtime support
is required or expense incurred; once installed, the restructuring of the database is
complete. This is in contrast to lazy conversion, which requires the runtime system to
check for the existence of still-unconverted instances indefinitely. On the other hand,
eager conversion makes evolutions very time-consuming to install.
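
The trade-off between the two conversion policies can be illustrated with a minimal Python sketch. It is not taken from ORION or GemStone: the names `Store`, `Instance`, `convert`, `evolve_eager`, `evolve_lazy` and `fetch` are hypothetical, and the evolution is assumed to add a single attribute with a default value.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    version: int
    slots: dict


def convert(inst: Instance) -> None:
    """Hypothetical conversion for one evolution: add an attribute with a default."""
    inst.slots.setdefault("id_number", None)
    inst.version = 2


@dataclass
class Store:
    instances: list = field(default_factory=list)
    schema_version: int = 1
    conversions: int = 0  # counts conversion work, for illustration only

    def _convert(self, inst: Instance) -> None:
        convert(inst)
        self.conversions += 1

    def evolve_eager(self) -> None:
        # Eager: the whole conversion cost is paid when the evolution is installed.
        self.schema_version = 2
        for inst in self.instances:
            self._convert(inst)

    def evolve_lazy(self) -> None:
        # Lazy: only the schema changes now; instances are converted on reference.
        self.schema_version = 2

    def fetch(self, index: int) -> Instance:
        inst = self.instances[index]
        # Under lazy conversion this check has to stay in place indefinitely.
        if inst.version != self.schema_version:
            self._convert(inst)
        return inst


def make_store() -> Store:
    return Store([Instance(1, {"name": f"student-{i}"}) for i in range(3)])


eager = make_store()
eager.evolve_eager()
eager_cost_at_install = eager.conversions

lazy = make_store()
lazy.evolve_lazy()
lazy_cost_at_install = lazy.conversions
lazy.fetch(0)
lazy_cost_after_one_fetch = lazy.conversions
```

In the sketch the eager store converts every instance at installation time, while the lazy store converts only the instance that is actually fetched and must keep the version check in `fetch` for as long as unconverted instances may remain.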


Encore 


Encore implements emulation via user-defined exception handling routines. Whenever
there is a version conflict between the program and the referenced instance, the routine
associated with that method or instance (and that pair of versions) is called. The
routine is expected to make the method’s invocation conform to the expectations of
the instance or make the return value from the method invocation consistent with the
expectations of the calling program, whichever is appropriate. It is known, however,
that certain evolutions cannot be modeled adequately under this scheme. The problem
stems from the fact that each object can only instantiate a single version. If an evolution
includes the addition (subtraction) of information (e.g., the addition (deletion) of an
attribute), there is no place for older (newer) instances to store an associated value. The
best a programmer could do in such a system is associate a default attribute value for
all instances of older (newer) type-versions by installing an exception handling routine
to return the value when an application attempts to reference that attribute from an
old (new) instance [18].
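
The limitation can be made concrete with a small Python sketch of handler-based emulation. This is an illustration under stated assumptions, not Encore’s interface: the registry `HANDLERS`, the decorator `handler`, the function `read` and the attribute names are all hypothetical.

```python
class VersionConflict(Exception):
    """Raised when no handler covers a conflict between instance and application."""


class Instance:
    def __init__(self, version, slots):
        self.version = version
        self.slots = dict(slots)


# Handlers are keyed by (attribute, instance version, application version).
HANDLERS = {}


def handler(attribute, instance_version, app_version):
    def register(fn):
        HANDLERS[(attribute, instance_version, app_version)] = fn
        return fn
    return register


@handler("id_number", 1, 2)
def default_id_number(instance):
    # A version-1 instance has no slot in which to keep a real value,
    # so the handler can do no better than return a fixed default.
    return "UNKNOWN"


def read(instance, attribute, app_version):
    if attribute in instance.slots:
        return instance.slots[attribute]
    try:
        fn = HANDLERS[(attribute, instance.version, app_version)]
    except KeyError:
        raise VersionConflict(attribute) from None
    return fn(instance)


old = Instance(1, {"name": "John Smith"})
new = Instance(2, {"name": "John Smith", "id_number": "0001"})  # made-up value
```

Reading `id_number` from `old` through a version-2 application yields the same default for every old instance, which is exactly the restriction described above.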

The  Common  Lisp  Object  System 

CLOS[19, 12], while not an OODB system, provides extended support for class evolution
nonetheless. As Common Lisp system development is performed in an interactive
context, class redefinition is a frequent occurrence. Rather than discard all existing
instances, CLOS converts them according to a policy under the control of the user.
The default policy is to reinitialize attribute values that no longer correspond to the
attribute domain, and to delete attribute slots that are no longer represented in the class
definition. Users can override this policy by defining their own method that is called
automatically by the system. This method is passed as arguments the old and new
slot values, so relationships between deleted and added attributes can be enforced [19,
p.859].

Adaptation  and  Extension  of  (Relational)  Views 

Bertino[4] presents a schema evolution language which is an OODB adaptation of the
view mechanism found in many relational database systems. Her primary innovations
are the support of inheritance and object IDs (OIDs) for view instances, two important
characteristics of OODB models that are not present in the relational model. View
instances with OIDs are physically realized in the database, enabling the view mechanism
to support evolutions that specify the addition of an attribute, as envisioned in
Zdonik[21]. However, Bertino’s scheme focuses on how evolutions affect the schema. It
is not concerned explicitly with the effects upon the instances nor with compatibility
issues.

CLOSQL 

Monk’s CLOSQL[16] implements a class versioning scheme, but employs a conversion
adaptation  strategy.  Instances  are  converted  when  there  is  a  version  conflict,  but  unlike 
ORION  and  GemStone,  CLOSQL  can  convert  instances  to  older  versions  of  the  class  if 
necessary. 

Lerner  and  OTGen 

Lerner’s OTGen design[15] addresses the problem of complex evolutions requiring major
structural conversions of the database (e.g., information moving between classes,
sharing of data using pointers) using a special-purpose language to specify instance
conversion procedures. As it was developed in an integrated database context, where
the entire application set is recompiled when the schema changes, versioning and
compatibility were not considered.




Conversion  and  Compatibility 

Schema  Modification  vs.  Class  Versioning 

The schema evolution support provided by such systems as ORION and GemStone is
restricted to what Kim calls schema modification, that is, the direct modification of a
single logical schema [14]. When only one database schema exists, it is appropriate for
the system to convert all existing instances. From a database consistency perspective,
it must appear² that all instances have been converted when the evolution operation is
applied. In fact, we would claim that it is the only sensible approach.

As has already been stated, however, conversion might render the instances inaccessible
to applications that had previously referenced them. The adaptation strategy
converts the instances but does not alter procedural references. Thus, application
programs written and compiled under the old schema may now be obsolete, unable to access
either the old, now converted, instances, or the ones created under the new schema.

A reasonable direction of research here would be to provide some automated mechanisms
to assist with program conversion; it is an active line of research [11, 1]. In
the OODB context, some work has been done on alerting the programmer to the
procedural dependencies of their evolution operation [8]. But
this is not the only possible solution. Rather than adjust programs to conform to the
data, it would seem easier to adjust the data to conform to the existing programs.
Also, it is not always possible to alter, or even recompile, programs (e.g., commercial
software). This lack of compatibility support is our primary motivation for adopting a
class versioning design for evolution management and support.

Under a class versioning scheme, multiple interfaces to a class, one per version, are
defined. When compiled, application programs are associated with a single version of
each of the classes they refer to; a schema configuration, if you will. With the database
populated with instances of multiple versions of a class, the runtime system must resolve
discrepancies between the version expected by the application and that of the referenced
instance.

Objects  Instantiating  Multiple  Versions 

Under the original Encore scheme[18], instances never change their type-version. Aware
of the restrictions this causes (cf. previous section), Zdonik proposed a scheme whereby
an existing instance can be “wrapped” with extra storage and a new interface, enabling
it to be a full-fledged instance of a new type-version [21]. While still able to be accessed
through its original interface/version, the wrapped object can also be manipulated
through the new interface. Thus, if the class evolution specifies the addition of an
attribute, the wrapping mechanism could allocate storage for the new slot in existing
instances, without denying backward compatibility (cf. Figure 1).

²Whether the instances are converted eagerly or lazily becomes an implementation issue.

Figure 1: Zdonik’s Wrapping Scheme: as in the Encore design, multiple
interfaces to the class are preserved. Here, extra space is allocated for the
attributes added as a result of the evolution, and applications can access
the instance through either the old or new interfaces.

Our proposed scheme is a generalization of this approach. Much like each class
has multiple versions, each instance is composed of multiple facets. Theoretically,
these multifaceted instances encapsulate the state of the instance for all the defined
interfaces (versions). The representation of these instances resembles a disjoint union of
the representation of each of the versions, and it is useful to consider the representation
as exactly that. As will be explained later, however, the process of actually allocating
and initializing the facets can be deferred until needed.

As  an  example,  consider  a  class  Undergraduate,  originally  including  attributes  Name, 
Program,  and  Class,  and  a  new  version  of  the  class  with  the  attributes  Name,  Id  Number, 
Advisor,  and  Class  Year.  (Class  is  one  of  {Freshman,  Sophomore,  Junior,  Senior},  while 
Class Year is the year the student is expected to graduate.) Program is the degree
program  in  which  the  student  is  enrolled,  and  Advisor  is  his  academic  advisor.  While 
instances  of  Undergraduate  in  the  database  will  contain  all  seven  distinct  attribute  slots, 
any  particular  application  will  be  restricted  to  one  version  and  thus  only  have  explicit 
access  to  one  facet. 

In reasoning about the relationship between any two versions³ of a class, it is useful
to divide the attributes into these four groups:

Shared: when an attribute is common⁴ to both versions.

Independent: when an attribute’s value cannot be affected by any modifications to
the attribute values in the other facet.

Derived: when an attribute’s value can be derived directly from the values of the
attributes in the other facet.

Dependent: when an attribute’s value is affected by changes in the values of attributes
in the other facet, but cannot be computed solely from those values.

³For explanatory purposes, imagine that we are describing a class consisting of only two versions,
and where the database is populated by instances of both.

⁴Common in the semantic sense, not just having the same name or type.

Figure 2: Disjoint union representation of the versioned class Undergraduate.
(Legend: shared attribute, derived attribute, dependent attribute.)

In our example, the Name attribute is shared by the versions, while Id Number is
independent. Class and Class Year are both derived attributes, since, given the current
date, it is possible to derive one from the other. Advisor is a dependent attribute,
since a change in Program might necessitate a change in advisor. Likewise, Program
is a dependent attribute, since a change in advisor might imply that the student has
switched degree programs (cf. Figure 2).

Zdonik  et  al.  [18,  21]  almost  always  cite  evolutions  involving  independent  or  derived 
attributes  in  their  examples.  The  original  Encore  emulation  scheme  is  adequate  for 
supporting  evolutions  that  introduce  shared  and  derived  attributes.  Zdonik’s  wrapping 
proposal  addresses  the  problems  associated  with  independent  attributes.  Our  proposed 
scheme, however, will provide a mechanism for managing class evolutions that include
dependent attributes. (See Table 1 for a comparison of the evolution capabilities
of various systems.)

Classes, and thus class-versions, are made up of methods as well as attributes. Most
object-oriented data models allow for the specification of private attributes that can
only be manipulated by the methods of the class. With respect to class evolution, the
addition, deletion, or alteration of a method that does not change the semantics of the
component instances can only affect the database schema, and thus no adaptation of
existing instances is required. For such changes, existing schema evolution technologies
perform adequately.

Figure 3: Some simple evolutions: (a) illustrates the relationships following
the addition of an independent attribute, (b) shows the addition of a derived
attribute.


Interacting with Multifaceted Instances (Example)

Consider an evolution specification language which can categorize attributes. This is
accomplished by associating extra information with each attribute.

Shared attributes have the name of the corresponding attribute from another version,

Derived attributes have a function for determining their value in terms of the values
of the attributes in the other version,

Dependent attributes have a function for determining their value in terms of the
values of the attributes in both versions (i.e., the entire object), and

Independent attributes have nothing extra at all.

With the class instances represented as a disjoint union of the version facets, as described
earlier, consistency between the facets can be maintained according to the following
procedure:

Whenever  an  attribute  value  of  a  facet  is  modified,  those  attributes  in  the 
other  facet  that  depend  on  it  must  be  updated.  For  shared  attributes,  the 
new  value  is  copied;  for  dependent  and  derived  attributes,  the  dependency 
functions  are  applied  and  the  result  written  into  the  (attribute)  slot  in  the 
other  facet. 
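
The specification and the propagation procedure above can be sketched in Python as follows. This is only an illustration for the two-version case: `AttrSpec`, `MultifacetedInstance`, the attribute names and the value chosen for `CURRENT_YEAR` are assumptions made for the example, and the sketch refreshes every non-independent attribute of the other facet instead of tracking exactly which attributes depend on the modified one.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CURRENT_YEAR = 1992  # assumed value, for the example only
YEARS_LEFT = {"Freshman": 3, "Sophomore": 2, "Junior": 1, "Senior": 0}
CLASS_FOR_YEARS_LEFT = {years: name for name, years in YEARS_LEFT.items()}


@dataclass
class AttrSpec:
    kind: str                      # "shared", "derived", "dependent" or "independent"
    source: Optional[str] = None   # shared: corresponding attribute in the other version
    fn: Optional[Callable] = None  # derived: fn(other); dependent: fn(other, own)


class MultifacetedInstance:
    def __init__(self, specs, facets):
        self.specs = specs    # {version: {attribute: AttrSpec}}
        self.facets = facets  # {version: {attribute: value}}

    def write(self, version, attribute, value):
        self.facets[version][attribute] = value
        for other in self.facets:
            if other != version:
                self._propagate(version, other)

    def _propagate(self, src, dst):
        source_facet, target_facet = self.facets[src], self.facets[dst]
        for name, spec in self.specs[dst].items():
            if spec.kind == "shared":
                target_facet[name] = source_facet[spec.source]
            elif spec.kind == "derived":
                target_facet[name] = spec.fn(source_facet)
            elif spec.kind == "dependent":
                target_facet[name] = spec.fn(source_facet, target_facet)
            # independent attributes are never touched by propagation


specs = {
    1: {
        "name": AttrSpec("shared", source="name"),
        "class": AttrSpec(
            "derived",
            fn=lambda v2: CLASS_FOR_YEARS_LEFT[v2["class_year"] - CURRENT_YEAR],
        ),
    },
    2: {
        "name": AttrSpec("shared", source="name"),
        "id_number": AttrSpec("independent"),
        "class_year": AttrSpec(
            "derived", fn=lambda v1: CURRENT_YEAR + YEARS_LEFT[v1["class"]]
        ),
    },
}
student = MultifacetedInstance(
    specs,
    {
        1: {"name": "John Smith", "class": "Freshman"},
        2: {"name": "John Smith", "id_number": "0001", "class_year": CURRENT_YEAR + 3},
    },
)
student.write(1, "class", "Junior")  # class_year in facet 2 is refreshed immediately
```

A write through either version leaves the other facet consistent before the call returns, which is the eager form of propagation that the procedure describes.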
Description of Evolution: ORION, Encore, Bertino[4], Us
Add/delete/rename a method: ✓, ✓, ✓, ✓
Add an attribute: ✓, ✗, ✓, ✓
Delete an attribute: ✓, ✓, ✓
Generalize the domain of an attribute: ✓, ✓, ✓
Change (arbitrarily) the domain of an attribute: ✗, ✓, ✓
Telescopingᵇ: ✓, ✓, ✓, ✗ᶜ
Change supertypes: ✓, ?, ✓, ✗

Table 1: Some evolutions and which systems support them.

ᵃThis evolution cannot be defined directly, but can be implemented by replacing the attribute in the
new version with generic methods for reading and writing.[3]

ᵇDefinition appears in conclusions.

ᶜcf. p.16 for clarification.

The remainder of this section consists of an example.


Consider a multifaceted instance of the Undergraduate versioned class,
represented graphically as follows:


Imagine that John Smith returns to university after his first summer vacation
and wishes to change to the undergraduate Math program. Also, he
had taken some summer classes that have given him enough credits to graduate
a year early. The changes to his data record are recorded through an
application program employing the first version of the Undergraduate class.
The system must now propagate those modifications to the second facet.

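The attributes Class and Class Year can be derived in terms of each other.
A reasonable derivation function for Class Year is:

\[
\mathit{ClassYear} =
\begin{cases}
cy + 3 & \text{if } \mathit{Class} = \text{Freshman} \\
cy + 2 & \text{if } \mathit{Class} = \text{Sophomore} \\
cy + 1 & \text{if } \mathit{Class} = \text{Junior} \\
cy & \text{if } \mathit{Class} = \text{Senior}
\end{cases}
\]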


Where cy is the current year. The Advisor attribute is dependent upon the
value of the Program attribute, but not completely derivable. A reasonable
dependency function is:

\[
\mathit{Advisor} =
\begin{cases}
\mathit{Advisor} & \text{if } \mathit{Advisor} \in \mathit{Program}\ \text{faculty} \\
\text{nil} & \text{otherwise}
\end{cases}
\]


Since  there  is  not  enough  information  to  derive  it,  the  student’s  advisor  will 
have  to  be  filled  in  later. 

Applying  these  functions  in  concert  with  the  desired  changes  to  John  Smith’s 
record,  the  multifaceted  instance  becomes: 
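
The two functions can be applied to the record in a few lines of Python. The figure showing the actual record is not reproduced here, so every concrete value below other than the student’s name and the change to the Math program is a placeholder: the original program, the advisor, the faculty lists, the Id Number and the value of cy are assumptions made for the sketch.

```python
CURRENT_YEAR = 1992  # assumed value of cy

YEARS_TO_GRADUATION = {"Freshman": 3, "Sophomore": 2, "Junior": 1, "Senior": 0}

# Hypothetical faculty lists; none are given in the text.
FACULTY = {"Physics": {"Prof. A"}, "Math": {"Prof. B"}}


def class_year(facet1):
    """Derivation function: Class Year computed from Class in the first facet."""
    return CURRENT_YEAR + YEARS_TO_GRADUATION[facet1["Class"]]


def advisor(facet1, facet2):
    """Dependency function: keep the advisor only if on the Program's faculty."""
    current = facet2["Advisor"]
    return current if current in FACULTY[facet1["Program"]] else None


facet1 = {"Name": "John Smith", "Program": "Physics", "Class": "Freshman"}
facet2 = {
    "Name": "John Smith",
    "Id Number": "0001",
    "Advisor": "Prof. A",
    "Class Year": CURRENT_YEAR + 3,
}

# Updates made through an application using the first version of the class.
facet1["Program"] = "Math"
facet1["Class"] = "Junior"

# Propagation to the second facet.
facet2["Name"] = facet1["Name"]
facet2["Class Year"] = class_year(facet1)
facet2["Advisor"] = advisor(facet1, facet2)
```

With these placeholder values the old advisor is not on the Math faculty, so the Advisor slot becomes nil (`None`) until it is filled in later, while the independent Id Number is left untouched.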


Representing  Multifaceted  Instances 

In the previous section we described the semantics of our schema versioning scheme. In
this section we address the issue of how to realize these multifaceted instances physically
in the database.

Our basis for consideration is a system which implements the design as described:
class evolutions are defined by creating a new version of the class; new facets
(corresponding to the new version) are associated with every instance of the class and
initialized according to a user-defined procedure. Each application program interacts
with the instances through a single (interface) version and modifications to attribute
slots on the “visible” facet are immediately propagated to the other facets, using a
mechanism similar to the trigger facility found in many relational and AI database
systems [20, 10].

The  most  obvious  target  for  improvement  in  this  scheme  is  how  new  facets  are  added. 
The  allocation  and  initialization  of  new  facets  for  existing  instances  at  evolution  time 
is subject to some of the same criticisms as eager conversion (cf. the discussion of
GemStone under Related work). Thus, it is
advantageous  to  defer  the  addition  of  the  new  facet  as  long  as  possible,  i.e.,  until  an 
application  program  attempts  to  reference  the  new  facet. 

The strategy of deferring the actual maintenance of a dependency constraint until
its effect is actually required can be applied as well to the propagation of information
among the facets of an instance. Rather than update the attribute values of the other
facet(s) each time a facet attribute is modified, one need only bring a facet up-to-date
when there is an attempt to access it. This scheme can be supported by associating a
flag with each facet indicating whether the facet is up-to-date with respect to the most
recently modified facet. Read operations on facets with an unset flag are preceded by
a resynchronization operation, which performs any necessary updates and sets the flag.

This  scheme  reduces  overall  runtime  expense,  since  the  resynchronization  step  is  not 
performed  in  concert  with  every  update  operation,  as  was  previously  the  case.  However, 
it  does  increase  the  potential  cost  of  inexpensive  read  operations. 
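
A minimal Python sketch of the flag-based scheme follows. The class `LazyFacets` and its members are hypothetical names, and the sketch makes one assumption that goes beyond the description above: a stale facet is also resynchronized before it is written, so that an earlier update made through another facet is not lost.

```python
class LazyFacets:
    """Facets are brought up to date only when they are accessed."""

    def __init__(self, facets, sync):
        self.facets = facets  # {version: {attribute: value}}
        self.sync = sync      # {(src, dst): fn(src_facet, dst_facet)}
        self.up_to_date = {version: True for version in facets}  # one flag per facet
        self.last_written = None
        self.resyncs = 0      # bookkeeping for the example only

    def _resynchronize(self, version):
        if not self.up_to_date[version]:
            src = self.last_written
            self.sync[(src, version)](self.facets[src], self.facets[version])
            self.up_to_date[version] = True
            self.resyncs += 1

    def read(self, version, attribute):
        self._resynchronize(version)
        return self.facets[version][attribute]

    def write(self, version, attribute, value):
        # Assumption beyond the text: resynchronize a stale facet before writing.
        self._resynchronize(version)
        self.facets[version][attribute] = value
        for other in self.up_to_date:
            self.up_to_date[other] = (other == version)
        self.last_written = version


def copy_name(src, dst):
    dst["name"] = src["name"]  # "name" is a shared attribute


student = LazyFacets(
    facets={
        1: {"name": "John Smith"},
        2: {"name": "John Smith", "id_number": "0001"},
    },
    sync={(1, 2): copy_name, (2, 1): copy_name},
)

for new_name in ("J. Smith", "John Q. Smith", "John Smith Jr."):
    student.write(1, "name", new_name)  # three writes, no propagation yet

resyncs_before_read = student.resyncs
name_seen_by_v2 = student.read(2, "name")  # one resynchronization happens here
```

Three consecutive writes through the first version cost no propagation at all; the single resynchronization is paid by the first read through the second version, which is the shift of cost from writes to reads noted above.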

To  this  point,  we  have  been  very  liberal  with  our  allocation  of  space  for  instance 
representation.  Although  the  lazy  allocation  of  facets  conserves  some  space  in  the  short 
run,  the  disjoint  union  model  implies  that  every  instance  of  a  versioned  class  will  have 
a  complete  collection  of  facets.  There  are  a  few  optimizations  that  could  be  performed 
to  reduce  space  requirements. 

The  first  space-saving  improvement  entails  having  each  set  of  shared  attributes 
occupy  a  single  slot  in  the  multifaceted  version.  A  performance  improvement  might 
also be realized here, since slot sharing reduces the expense and/or frequency of update
propagations (cf. Figure 4).

Under certain circumstances, the slot associated with a derived attribute can be
recovered as well. If an inverse procedure to the derivation function is known to the
system, then the attribute can be implemented using the appropriate reader and writer
methods to simulate it. For many evolutions, the inverse procedure appears as the
derivation function for the related attribute in the other facet. The Class and Class Year
attributes in our Undergraduate example are related in that way (cf. Figure 5).
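
As a sketch of this optimization, the Python class below allocates a slot for Class only and simulates Class Year with a reader and a writer. The class layout, the attribute name `klass` and the value of `CURRENT_YEAR` are assumptions made for the illustration.

```python
CURRENT_YEAR = 1992  # assumed value, for the example only
YEARS_TO_GRADUATION = {"Freshman": 3, "Sophomore": 2, "Junior": 1, "Senior": 0}
CLASS_FOR_YEARS_LEFT = {years: name for name, years in YEARS_TO_GRADUATION.items()}


class Undergraduate:
    """Stores only Class; Class Year is simulated by reader and writer methods."""

    def __init__(self, name, klass):
        self.name = name
        self.klass = klass  # the one slot actually allocated

    @property
    def class_year(self):
        # Reader: the derivation function, evaluated on every read.
        return CURRENT_YEAR + YEARS_TO_GRADUATION[self.klass]

    @class_year.setter
    def class_year(self, year):
        # Writer: the inverse procedure, storing its result in the Class slot.
        self.klass = CLASS_FOR_YEARS_LEFT[year - CURRENT_YEAR]


student = Undergraduate("John Smith", "Freshman")
```

No storage is held for `class_year`; each read recomputes it from `klass`, and each write is translated into an update of `klass`.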

From a runtime performance perspective, this space optimization reduces the expense
of write operations while making read operations more costly. The slot allocated
for a derived attribute acts as a cache for its derivation function and, depending on
the frequency of modifications to its dependent attributes in the other facet(s), its
maintenance might be more time-efficient.

Figure 4: Multifaceted instance representation using common slot for
shared attributes

Figure 5: Multifaceted instance representation minimizing derived attribute
allocation. For the Undergraduate class, two minimizations exist.

Forcing  an  Adaptation  Strategy 

While an important feature in general, program compatibility is not always required
(e.g., a database with a single application program and a single user). In such situations
one should be able to improve performance by instructing the system to convert fully
the existing instances and discard (or perhaps archive) the old information.
Furthermore, conversion and compatibility are not mutually exclusive. As long as an inverse
conversion procedure is known, one could convert and emulate the former version. This
might be useful when you want to preserve compatibility, but expect that it will be
needed infrequently enough that you are willing to pay the cost of emulation in those
instances. If applications tend to access a distinct subset of the
instance collection, one could use a strategy that converts (on access) instances to the
version of the application. (This is the approach taken by Monk’s CLOSQL system [16].)
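
The convert-on-access strategy can be sketched as follows. This is an illustration under stated assumptions, not a description of how CLOSQL is implemented: the store, the converter registry, the version numbers, and the year arithmetic are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Converter = Callable[[dict], dict]


@dataclass
class Stored:
    version: int  # schema version the fields currently conform to
    fields: dict


class ConvertOnAccessStore:
    """Converts an instance to the requesting application's version when fetched."""

    def __init__(self) -> None:
        self.instances: Dict[str, Stored] = {}
        self.converters: Dict[Tuple[int, int], Converter] = {}

    def register(self, src: int, dst: int, fn: Converter) -> None:
        # A conversion procedure from version src to version dst.
        self.converters[(src, dst)] = fn

    def fetch(self, oid: str, app_version: int) -> dict:
        stored = self.instances[oid]
        if stored.version != app_version:
            # Convert in place; the instance now lives in the caller's version.
            convert = self.converters[(stored.version, app_version)]
            stored.fields = convert(stored.fields)
            stored.version = app_version
        return stored.fields


# Hypothetical two-version schema; the arithmetic is an assumed derivation.
store = ConvertOnAccessStore()
store.register(1, 2, lambda f: {"class_year": 1993 + (4 - f["year"])})
store.register(2, 1, lambda f: {"year": 4 - (f["class_year"] - 1993)})
store.instances["s1"] = Stored(version=1, fields={"year": 2})
```

Registering the pair (2, 1) corresponds to knowing the inverse conversion procedure. Without it, a fetch by a version 1 application after conversion would fail with a KeyError, which is the situation in which compatibility has been given up.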

Sometimes, modification of the database or its schema is impossible. Databases
might be read-only for permission (e.g., remote database exported as a public service) or
licensing reasons (e.g., reference materials on CD-ROM). In such situations, something
resembling Zdonik’s wrapping scheme (cf. p.7) must be used, with the wrapper actually
residing in a separate database.
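
One way to picture a wrapper that resides in a separate database is sketched below. It is only loosely modeled on the wrapping idea and makes several assumptions: the read-only database is a mapping that rejects writes, new attributes are either derived from base fields or stored in a writable side store, and every name is hypothetical.

```python
from types import MappingProxyType
from typing import Any, Callable, Dict, Mapping


class SideStoreWrapper:
    """Presents an evolved interface over a database that cannot be modified."""

    def __init__(self, base: Mapping[str, Mapping[str, Any]],
                 derivations: Dict[str, Callable[[Mapping[str, Any]], Any]]) -> None:
        self.base = base                # read-only database, never written
        self.derivations = derivations  # new attribute -> function of base fields
        self.side: Dict[str, Dict[str, Any]] = {}  # separate, writable database

    def read(self, oid: str, attr: str) -> Any:
        extra = self.side.get(oid, {})
        if attr in extra:
            return extra[attr]          # value held by the wrapper itself
        if attr in self.derivations:
            return self.derivations[attr](self.base[oid])  # derived on demand
        return self.base[oid][attr]     # attribute unchanged by the evolution

    def write(self, oid: str, attr: str, value: Any) -> None:
        # All writes land in the side store; the base is left untouched.
        self.side.setdefault(oid, {})[attr] = value


# Hypothetical read-only data and one derived attribute.
base = MappingProxyType({"s1": MappingProxyType({"year": 2})})
view = SideStoreWrapper(base, {"class_year": lambda f: 1993 + (4 - f["year"])})
```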

Inheritance 

When a class is evolved, it may affect not only its direct instances, but the instances of its
subclasses as well. If we consider class instances as represented by slices, each slice
instantiating the superclass, then it is easy to see what would happen. The instance
adaptation scheme that applies to direct instances applies to the appropriate slices in
each of the subclass instances.
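
A small sketch may make the slice view concrete. It assumes an instance is a mapping from class name to the slice holding that class's attributes; the class names and the adaptation function are hypothetical.

```python
from typing import Callable, Dict, List

Slice = Dict[str, object]
Instance = Dict[str, Slice]  # class name -> slice for that class's attributes


def adapt_slices(instances: List[Instance], class_name: str,
                 adapt: Callable[[Slice], Slice]) -> int:
    """Apply adapt to the class_name slice of every instance that has one."""
    adapted = 0
    for instance in instances:
        if class_name in instance:
            # Direct instances and subclass instances are handled identically.
            instance[class_name] = adapt(instance[class_name])
            adapted += 1
    return adapted


person = {"Person": {"name": "Ada"}}
student = {"Person": {"name": "Grace"}, "Student": {"year": 2}}
course = {"Course": {"title": "Databases"}}
count = adapt_slices([person, student, course], "Person",
                     lambda s: {**s, "email": None})
```

Evolving Person here touches the Person slice of the Student instance but leaves its Student slice, and the unrelated Course instance, alone.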

However,  that  does  not  explain  how  our  system  handles  evolutions  that  affect  the 
inheritance  graph  of  the  schema.  Changes  to  the  inheritance  list  of  a  class  (i.e.,  addition 
or  deletion  of  a  superclass,  or  the  reordering  of  the  superclass  list)  can  be  viewed  as 
a  compound  evolution,  incorporating  many  additions  and  deletions  of  attributes.  The 
mechanisms  described  for  the  addition  or  deletion  of  single  attributes  work  similarly 
here. 
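
To illustrate how a change to the superclass list decomposes into attribute additions and deletions, consider the sketch below. It assumes a single level of inheritance and assumes that name conflicts are resolved in favor of the earliest superclass in the list; both are simplifications, and all names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def resolve(superclasses: List[str],
            attrs: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map each inherited attribute to the superclass that supplies it."""
    origin: Dict[str, str] = {}
    for name in superclasses:
        for attr in attrs[name]:
            origin.setdefault(attr, name)  # earliest superclass wins (assumed)
    return origin


def as_attribute_changes(old: List[str], new: List[str],
                         attrs: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (added, deleted) attributes implied by a superclass list change."""
    before, after = resolve(old, attrs), resolve(new, attrs)
    added = {a for a, src in after.items() if before.get(a) != src}
    deleted = {a for a, src in before.items() if after.get(a) != src}
    return added, deleted


attrs = {
    "Person": {"name", "id"},
    "Employee": {"id", "salary"},
    "Student": {"gpa"},
}
```

An attribute whose supplying superclass changes, as can happen on reordering, shows up as both a deletion and an addition, so the single-attribute mechanisms can be applied to each element of the two sets.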

Support for evolutions that merge or split existing classes into new classes is
discussed in the Future Work section.


Conclusions  and  Future  Work 

In this paper, we have described a specification model for schema evolution that has
the following features:

•  Schema  versioning  instead  of  modification  to  a  single  schema,  so  that  program 
compatibility  can  be  supported,  if  desired. 

•  Compatibility support is provided at less runtime cost than the Encore facility.
For each version of a class, each instance has a corresponding facet. For attributes
which can be derived solely from attributes from other facets, this facility is like
a cache, sacrificing space for time. For attributes which are not reflected in the
representation of the other versions, the facet provides space for the value to be stored,
thereby preserving information that would be lost under a conversion scheme.

•  A broader variety of evolutions is supported than in existing systems (ORION,
GemStone, Encore). However, not all evolutions (e.g., telescoping
and the more complex reorganization evolutions of Lerner [15]) are currently
possible. (See below.)

•  Fine  tuning  of  the  adaptation  scheme  is  possible,  by  allowing  the  programmer  to 
decide  how  much  duplication  of  information  is  present  in  each  instance.  If  one 
version  is  used  very  infrequently,  the  programmer  can  save  space  and  time,  by 
emulating  its  interface  instead. 


The remainder of this section discusses topics and goals for ongoing and future
research.

Complex Evolutions Involving Inheritance

There are some complex evolutions that we have not addressed: telescoping and class
splitting.

A telescoping evolution is one that collects attributes from the component classes
and installs them as attributes in the evolved class [17]. The problem our scheme
has with these evolutions is that the derivation function in such a case refers to other
instances whose value can change without the first instance being notified. In order
to support this type of evolution, the programmer must be able to force the attribute
to be derived each time it is referenced, and a generic write method would need to be
supplied.

The problem associated with the act of splitting a class up is that it might involve
a splitting of the instance collection as well. We have yet to examine how this might
be accomplished in our model.
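
For the telescoping case, the requirement to derive on every reference and to supply a generic write method could look roughly like the following. This is a sketch of the requirement, not of an existing facility in our system; the descriptor and the Department and Employee classes are hypothetical.

```python
class Telescoped:
    """Attribute that is re-derived from a component instance on every access."""

    def __init__(self, component: str, attribute: str) -> None:
        self.component = component  # name of the slot holding the component
        self.attribute = attribute  # attribute collected from the component

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # No cached slot: always follow the reference to the component.
        return getattr(getattr(instance, self.component), self.attribute)

    def __set__(self, instance, value) -> None:
        # Generic write method: forward the write to the component instance.
        setattr(getattr(instance, self.component), self.attribute, value)


class Department:
    def __init__(self, name: str) -> None:
        self.name = name


class Employee:
    dept_name = Telescoped("dept", "name")  # telescoped from Department.name

    def __init__(self, dept: Department) -> None:
        self.dept = dept
```

Because nothing is cached in the Employee instance, a change made directly to the Department is visible on the next read without any notification.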

Version  Configurations 

A requirement that was not addressed at all in this paper is the ability to version groups
of classes. For evolutions affecting component classes, it might be convenient to be able
to collect a group of classes into a version. This will allow a class to specify the
class-versions of its attributes’ domains. Such a facility might also assist in the support of
the aforementioned complex evolutions.

Programming an Adaptation Strategy

Our system as described has more versatility than ORION’s facility because it supports
user-defined instance adaptation information. Consistent with our desire to aid the
schema designer, we would like to provide the ability to install user-defined adaptation
strategies based on the disjoint union data model. For example, a database that must be
highly available during business hours could maintain a log of the instances touched
during the day and spend the idle overnight hours converting them.
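
The overnight strategy in this example might be programmed along the following lines. This is a sketch only: the class, its method names, and the conversion function are hypothetical, and daytime compatibility (for instance through emulation) is assumed to be handled elsewhere.

```python
from typing import Callable, Dict, Set


class OvernightConverter:
    """Logs instances touched during the day and converts them when idle."""

    def __init__(self, instances: Dict[str, dict],
                 convert: Callable[[dict], dict]) -> None:
        self.instances = instances
        self.convert = convert
        self.touched: Set[str] = set()    # log of instances accessed since last batch
        self.converted: Set[str] = set()  # instances already in the new representation

    def access(self, oid: str) -> dict:
        # Daytime path: record the access and return without converting.
        self.touched.add(oid)
        return self.instances[oid]

    def run_idle_batch(self) -> int:
        # Overnight path: convert each logged instance at most once.
        pending = self.touched - self.converted
        for oid in pending:
            self.instances[oid] = self.convert(self.instances[oid])
        self.converted |= pending
        self.touched.clear()
        return len(pending)
```

Instances that are never touched are never converted, so the conversion cost is paid only for the part of the database that is in use.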